[57] ABSTRACT

A computer system includes an apparatus which enables 
transactions directed to a particular target device such as one 
situated inside a bridge to be shunted directly to the device 
without requiring that the transaction actually proceed to the 
device through a bus on which the device is located. 
However, the transaction may, in fact, also be run on the bus
on which the device is located, in which case the ID select for
the target device may be masked. In this way, it is possible to run
transactions to a particularly critical device even when the 
bus on which it is located is, for one reason or another, not 
operating. 
FAULT TOLERANT COMPUTER SYSTEM

CROSS-REFERENCE TO RELATED APPLICATION

This is a continuation-in-part of U.S. patent application
Ser. No. 08/658,750, now U.S. Pat. No. 6,032,271 filed on
Jun. 5, 1996.

FIELD OF THE INVENTION

This invention relates generally to computer systems with
bus-to-bus bridges, and particularly to computer systems
that can continue to operate after hardware or software faults
occur.

BACKGROUND OF THE INVENTION

Computer systems of the PC type usually employ a
so-called expansion bus to handle various data transfers and
transactions related to I/O and disk access. The expansion
bus is separate from the system bus or from the bus to which
the processor is connected, but is coupled to the system bus
by a bridge circuit.

For some time, all PC's employed the ISA (Industry
Standard Architecture) expansion bus, which was an 8-Mhz,
16-bit device (actually clocked at 8.33 Mhz). Using two
cycles of the bus clock to complete a transfer, the theoretical
maximum transfer rate was 8.33 Mbytes/sec. Next, the EISA
(Extension to ISA) bus was widely used, this being a 32-bit
bus clocked at 8-Mhz, allowing burst transfers at one per
clock cycle, so the theoretical maximum was increased to 33
Mbytes/sec. As performance requirements increased, with
faster processors and memory, and increased video bandwidth
needs, a high performance bus standard was a necessity.
Several standards were proposed, including a Micro
Channel architecture which was a 10-Mhz, 32-bit bus,
allowing 40 MByte/sec, as well as an enhanced Micro
Channel using a 64-bit data width and 64-bit data streaming,
theoretically permitting 80-to-160 Mbyte/sec transfer. The
requirements imposed by the use of video and graphics
transfer on networks, however, necessitate even faster
transfer rates. One approach was the VESA (Video Electronics
Standards Association) bus which was a 33 Mhz, 32-bit local
bus standard specifically for a 486 processor, providing a
theoretical maximum transfer rate of 132 Mbyte/sec for
burst, or 66 Mbyte/sec for non-burst; the 486 had limited
burst transfer capability. The VESA bus was a short-term
solution as higher-performance processors, e.g., the Intel P5
and P6 or Pentium and Pentium Pro processors, became the
standard.

The PCI (Peripheral Component Interconnect) bus was
proposed by Intel as a longer-term solution to the expansion
bus standard, particularly to address the burst transfer issue.
The original PCI bus standard has been upgraded several
times, with the current standard being Revision 2.1,
available from a trade association group referred to as PCI
Special Interest Group, P.O. Box 14070, Portland, Oreg.
97214. The PCI Specification, Rev. 2.1, is incorporated
herein by reference. Construction of computer systems using
the PCI bus, and the PCI bus itself, are described in many
publications, including "PCI System Architecture," 3rd Ed.,
by Shanley et al., published by Addison-Wesley Pub. Co.,
also incorporated herein by reference. The PCI bus provides
for 32-bit or 64-bit transfers at 33- or 66-Mhz; it can be
populated with adapters requiring fast access to each other
and/or with system memory, and that can be accessed by the
host processor at speeds approaching that of the processor's
native bus speed. A 64-bit, 66-MHz PCI bus has a theoretical
maximum transfer rate of 528 MByte/sec. All read and write
transfers over the bus can be burst transfers. The length of
the burst can be negotiated between initiator and target
devices, and can be any length.

System and component manufacturers have implemented
PCI bus interfaces in various ways. For example, Intel
Corporation manufactures and sells a PCI Bridge device
under the part number 82450GX, which is a single-chip
host-to-PCI bridge, allowing CPU-to-PCI and PCI-to-CPU
transactions, and permitting up to four P6 processors and
two PCI bridges to be operated on a system bus. Another
example, offered by VLSI Technology, Inc., is a PCI chipset
under the part number VL82C59x SuperCore, providing
logic for designing a Pentium based system that uses both
PCI and ISA buses. The chipset includes a bridge between
the host bus and the PCI bus, a bridge between the PCI bus
and the ISA bus, and a PCI bus arbiter. Posted memory write
buffers are provided in both bridges, and provision is made
for Pentium's pipelined bus cycles and burst transactions.

The "Pentium Pro" processor, commercially available
from Intel Corporation, uses a processor bus structure as
defined in the specification for this device, particularly as set
forth in the publication "Pentium Pro Family Developer's
Manual," Vols. 1-3, Intel Corp., 1996, available from
McGraw-Hill, and incorporated herein by reference; this
manual is also available from Intel by accessing
<http://www.intel.com>.

A CPU operates at a much faster clock rate and data
access rate than most of the resources it accesses via a bus.
In earlier processors, such as those commonly available
when the ISA bus and EISA bus were designed, this delay
in reading data from a resource on the bus was handled by
wait states. When a processor requested data that was not
immediately available due to a slow memory or disk access,
then the processor merely marked time using wait states,
doing no useful work, until the data finally became
available. In order to make use of this delay time, a processor
such as the P6 provides a pipelined bus that allows multiple
transactions to be pending on the bus at one time, rather than
requiring one transaction to be finished before starting
another. Also, the P6 bus allows split transactions, i.e., a
request for data may be separated from the delivery of the
data by other transactions on the bus. The P6 processor uses
a technique referred to as a "deferred transaction" to
accomplish the split on the bus. In a deferred transaction, a
processor sends out a read request, for example, and the
target sends back a "defer" response, meaning that the target
will send the data onto the bus, on its own initiative, when
the data becomes available. Another transaction available on
the P6 bus is a "retry" response. If a target is not able to
supply a requested item, the target may respond to the
request from the processor using a retry signal, and in that
case the processor will merely send the request again the
next time it has access to the bus.

The PCI bus specification as set forth above does not
provide for deferred transactions. There is no mechanism for
issuing a "deferred transaction" signal, nor for generating
the deferred data initiative. Accordingly, while a P6
processor can communicate with resources such as main memory
that are on the processor bus itself using deferred
transactions, this technique is not employed when
communicating with disk drives, network resources, compatibility
devices, etc., on an expansion bus.

In existing computer systems read and write transactions
commonly run from an initiator on one bus to a target on
another bus. These transactions commonly traverse through
a bus-to-bus bridge which connects the two buses. A bus
may contain a number of slots which may be filled by
devices which are potential initiators or targets. A number of
problems may arise which cause a particular bus to become
inoperable. One common situation is for a bus hang
condition to arise which may occur, for example, in the common
IRDY bus hang situation. Once the bus recognizes an error
condition the transaction which gave rise to the error could
be aborted. However, this may not always cure the problem.
Thus, it would be desirable to determine the cause of the
problem and to attempt to overcome it if possible. This type
of diagnostic procedure may be complicated by the fact that
it is necessary to access the troubled bus in order to obtain
information about the nature of the problem which has
occurred. For example, devices which are on the bus may
contain information about the transactions which occurred
previously. This information may provide useful
information for determining the source of the problem and perhaps
even the nature of the problem. When the bus is inoperative,
there may be no way for the internal system to determine
how to correct itself. As a result, many error conditions may
result in system crashes. System crashes generally
necessitate a visit from a repair technician and often entail
considerable downtime for the entire system.

Another issue which arises in many current computer
systems involves bridges which include devices which may
be either initiators or targets of transactions being run on
particular buses. Generally, a transaction passing through a
bridge is run on a connected bus. Because of the way the
buses operate, a signal is sent out on the bus to a target
device but the signal also proceeds to the end of the bus and
is reflected back. The signal that the target device receives
is a combination of the initial wave and the reflected wave.
As a result, the signal integrity of the received signals may
be less for devices resident on the bridge itself, because
those devices receive the reflected signal with the longest
delay. As a result, the signal received by bridge resident
targets may have integrity problems because of the
considerable delay between receipt of the initial signal and the
reflected signal.

There is a considerable need for a computer system which
facilitates the correction of bus errors and which improves
the integrity of bus signals.

SUMMARY OF THE INVENTION

In accordance with one aspect of the present invention, a
computer system includes a processor and a bridge
communicating with the processor. There is a target and an initiator
on a bus. A communication path is provided for transactions
from said initiator directly to the target without using the
bus.

In accordance with another aspect of the present
invention, a bridge for a computer system includes an
initiator and a target connectable to the same bus and located
within the bridge. A path for communicating bus
transactions directly to the target without using the bus is provided.

In accordance with still another aspect of the present
invention, a method of processing transactions between an
initiator and a target on a bus includes the step of initiating
a transaction from the initiator and receiving the transaction
in a bridge. The transaction is then driven directly to the
target without using the second bus.

In accordance with yet another aspect of the present
invention, a method of processing transactions between an
initiator and a target on a bus includes the step of initiating
a transaction from the initiator directed to the target on the
bus. The transaction is received in a bridge which also
includes the target. The transaction is issued from the bridge
to the bus. The transaction is also driven directly to the target
without using the bus.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of one illustrative system that
could implement the present invention;

FIG. 2 is a block diagram of the primary and secondary
bridges shown in FIG. 1;

FIGS. 3a-3g are timing diagrams showing events
occurring on the buses in the system of FIG. 1;

FIG. 4 is a block diagram corresponding to FIG. 2; and

FIG. 5 is a block diagram of one implementation of the
present invention.

DESCRIPTION OF A PREFERRED EMBODIMENT

Referring to FIG. 1, a computer system 10 is shown which
may use features of the invention, according to one
embodiment. The system includes multiple processors 11, 12, 13
and 14 in this example, although the improvements may be
used in a single processor environment. The processors are
of the type manufactured and sold by Intel Corporation
under the trade name "Pentium Pro," although the
processors are also referred to as "P6" devices. The structure and
operation of these processors 11, 12, 13, and 14 are
described in detail in the above-mentioned Intel
publications, as well as in numerous other publications.

The processors are connected to a processor bus 15 which
is generally of the structure specified by the processor
specification, in this case a Pentium Pro specification. The
bus 15 operates from the processor clock, so if the
processors are 166 MHz or 200 MHz devices, for example, then the
bus 15 is operated on some multiple of the base clock rate.
The main memory is shown connected to the processor bus
15, and includes a memory controller 16 and DRAM
memory 17. The processors 11, 12, 13, and 14 each have a
level-two cache L2 as a separate chip within the same
package as the CPU chip itself, and of course the CPU chips
have level-one L1 data and instruction caches included
on-chip.

According to the invention, a bridge 18 or 19 is provided
between the processor bus 15 and a PCI bus 20 or 21. Two
bridges 18 and 19 are shown, although it is understood that
many systems would require only one, and other systems
may use more than two. In one example, up to four of the
bridges may be used. The reason for using more than one
bridge is to increase the potential data throughput. A PCI
bus, as mentioned above, is a standardized bus structure that
is built according to a specification agreed upon by a number
of equipment manufacturers so that cards for disk
controllers, video controllers, modems, network cards, and
the like can be made in a standard configuration, rather than
having to be customized for each system manufacturer. One
of the bridges 18 or 19 is the primary bridge, and the
remaining bridges (if any) are designated secondary bridges.
The primary bridge 18 in this example carries traffic for the
"legacy" devices such as (E)ISA bus, 8259 interrupt
controller, VGA graphics, IDE hard disk controller, etc. The
secondary bridge 19 does not usually incorporate any PC
legacy items.

All traffic between devices on the concurrent PCI buses 20
and 21 and the system memory 17 must traverse the pro-
cessor bus 15. Peer-to-peer transactions are allowed between
a master and target device on the same PCI bus 20 or 21;
these are called "standard" peer-to-peer transactions.
Transactions between a master on one PCI bus and a target device
on another PCI bus must traverse the processor bus 15, and
these are "traversing" transactions; memory and I/O reads
and writes are allowed in this case but not locked cycles and
some other special events.

In an example embodiment as seen in FIG. 1, PC legacy
devices are coupled to the PCI bus 20 by an (E)ISA bridge
23 to an EISA/ISA bus 24. Attached to the bus 24 are
components such as a controller 25 (e.g., an 8042) for
keyboard and mouse inputs 26, flash ROM 27, NVRAM 28,
and a controller 29 for floppy drive 30 and serial/parallel
ports 31. A video controller 32 for a monitor 33 is also
connected to the bus 20. On the other PCI bus 21, connected
by bridge 19 to the processor bus 15, are other resources
such as a SCSI disk controller 34 for hard disk resources 35
and 36, and a network adapter 37. A network 38 is accessed
by the adapter 37, and a large number of other stations
(computer systems) 39 are coupled to the network. Thus,
transactions on the buses 15, 20, and 21 may originate in or
be directed to another station or server 39 on the network 38.
The embodiment of FIG. 1 is that of a server, rather than a
standalone computer system, but the bridge features can be
used as well in a workstation or standalone desktop
computer. The controllers such as 32, 34, and 37 would usually
be cards fitted into PCI bus slots on the motherboard. If
additional slots are needed, a PCI-to-PCI bridge 40 may be
placed on the PCI bus 21 to access another PCI bus 41; this
would not provide additional bandwidth, but would allow
more adapter cards to be added. Various other server
resources can be connected to the PCI buses 20, 21, and 41,
using commercially-available controller cards, such as
CD-ROM drives, tape drives, modems, connections to ISDN
lines for internet access, etc.

The processor bus 15 contains a number of standard
signal or data lines as defined in the specification for the
Pentium Pro or P6 processor, mentioned above. In addition,
certain special signals are included for the unique operation
of the bridges 18 and 19, as will be described. The bus 15
contains thirty-three address lines 15a, sixty-four data lines
15b, and a number of control lines 15c. Most of the control
lines are not material here and will not be referred to; also,
data and address signals have parity lines associated with
them which will not be treated here. The control signals of
interest here are described in Appendix A, and include the
address strobe ADS#, data ready DRDY#, lock LOCK#,
data busy DBSY#, defer DEFER#, request command
REQ[4:0]# (five lines), response status RS[2:0]#, etc.

The PCI bus 20 (or 21) also contains a number of standard
signal and data lines as defined in the PCI specification. This
bus is a multiplexed address/data type, and contains
sixty-four AD lines 20a, eight command/byte-enable lines 20b,
and a number of control lines 20c as will be described. The
definition of the control lines of interest here is given in
Appendix B, including initiator ready IRDY#, lock
P_LOCK#, target ready TRDY#, STOP#, etc. In addition,
there are PCI arbiter signals 20d, also described in Appendix
B, including request REQx#, grant P_GNTx#, MEMACK#,
etc.

Referring to FIG. 2, the bridge circuit 18 (or 19) is shown
in more detail. This bridge includes an interface circuit 43
serving to acquire data and signals from the processor bus 15
and to drive the processor bus with signals and data
according to Appendix A. An interface 44 serves to drive the PCI
bus 20 and to acquire signals and data from the PCI bus
according to Appendix B. Internally, the bridge is divided
into an upstream queue block 45 (US QBLK) and a
downstream queue block 46 (DS QBLK). The term downstream
means any transaction going from the processor bus 15 to
the PCI bus 20, and the term upstream means any transaction
going from the PCI bus back toward the processor bus 15.
The bridge interfaces on the upstream side with the
processor bus 15 which operates at a bus speed related to the
processor clock rate which is, for example, 133 MHz, 166
MHz, or 200 MHz for Pentium Pro processors, whereas it
interfaces on the downstream side with the PCI bus which
operates at 33 or 66 MHz. Thus, one function of the bridge
18 is that of a buffer between asynchronous buses, and buses
which differ in address/data presentation, i.e., the processor
bus 15 has separate address and data lines, whereas the PCI
bus uses multiplexed address and data lines. To accomplish
these translations, all bus transactions are buffered in
FIFO's.

For transactions traversing the bridge 18, all memory
writes are posted writes and all reads are split transactions.
A memory write transaction initiated by a processor device
on the processor bus 15 is posted to the interface 43 of FIG.
2 and the processor goes on with instruction execution as if
the write had been completed. A read requested by a
processor 11-14 is not implemented at once, due to
mismatch in the speed of operation of all of the data storage
devices (except for caches) compared to the processor speed,
so the reads are all treated as split transactions in some
manner. An internal bus 47 conveys processor bus write
transactions or read data from the interface 43 to a
downstream delayed completion queue DSDCQ 48 and a RAM
49 for this queue, or to a downstream posted write queue 50
and a RAM 51 for this queue. Read requests going
downstream are stored in a downstream delayed request queue
DSDRQ 52. An arbiter 53 monitors all pending downstream
posted writes and read requests via valid bits on lines 54 in
the downstream queues and schedules which one will be
allowed to execute next on the PCI bus according to the read
and write ordering rules set forth in the PCI bus
specification. Commands to the interface 44 from the arbiter 53 are
on lines 55.

The components of upstream queue block 45 are similar
to those of the downstream queue block 46, i.e., the bridge
18 is essentially symmetrical for downstream and upstream
transactions. A memory write transaction initiated by a
device on the PCI bus 20 is posted to the PCI interface 44
of FIG. 2 and the master device proceeds as if the write had
been completed. A read requested by a device on the PCI bus
20 is not implemented at once by a target device on the
processor bus 15, so these reads are again treated as delayed
transactions. An internal bus 57 conveys PCI bus write
transactions or read data from the interface 44 to an
upstream delayed completion queue USDCQ 58 and a RAM
59 for this queue, or to an upstream posted write queue 60
and a RAM 61 for this queue. Read requests going upstream
are stored in an upstream delayed request queue USDRQ 62.
An arbiter 63 monitors all pending upstream posted writes
and read requests via valid bits on lines 64 in the upstream
queues and schedules which one will be allowed to execute
next on the processor bus according to the read and write
ordering rules set forth in the PCI bus specification.
Commands to the interface 43 from the arbiter 63 are on lines 65.

The structure and functions of the FIFO buffers or queues
in the bridge 18 will now be described. Each buffer in a
delayed request queue, i.e., DSDRQ 52 or USDRQ 62,
stores a delayed request that is waiting for execution, and
this delayed request consists of a command field, an address
field, a write data field (not needed if this is a read request),
and a valid bit. The upstream USDRQ 62 holds requests
originating from masters on the PCI bus and directed to
targets on the processor bus 15 and has eight buffers (in an
example embodiment), corresponding one-to-one with eight
buffers in the downstream delayed completion queue
DSDCQ 48. The downstream delayed request queue
DSDRQ 52 holds requests originating on the processor bus
15 and directed to targets on the PCI bus 20 and has four
buffers, corresponding one-to-one with four buffers in the
upstream delayed completion queue USDCQ 58. The
DSDRQ 52 is loaded with a request from the interface 43 via
bus 72 and the USDCQ 58. Similarly, the USDRQ 62 is
loaded from interface 44 via bus 73 and DSDCQ 48. The
reason for going through the DCQ logic is to check to see if
a read request is a repeat of a request previously made. Thus,
a read request from the bus 15 is latched into the interface
43 in response to an ADS#, capturing an address, a read
command, byte enables, etc. This information is applied to
the USDCQ 58 via lines 74, where it is compared with all
enqueued prior downstream read requests; if it is a duplicate,
this new request is discarded if the data is not available to
satisfy the request, but if it is not a duplicate, the information
is forwarded to the DSDRQ 52 via bus 72. The same
mechanism is used for upstream read requests; information
defining the request is latched into interface 44 from bus 20,
forwarded to DSDCQ 48 via lines 75, and if not a duplicate
of an enqueued request it is forwarded to USDRQ 62 via bus
73.

The delayed completion queues each include a control
block 48 or 58 and a dual port RAM 49 or 59. Each buffer
in a DCQ stores completion status and read data for one
delayed request. When a delayable request is sent from one
of the interfaces 43 or 44 to the queue block 45 or 46, the
first step is to check within the DCQ 48 or 58 to see if a
buffer for this same request has already been allocated. The
address and the commands and byte enables are checked
against the eight buffers in DCQ 48 or 58. If not a match,
then a buffer is allocated (if one is available), the request is
delayed (or deferred for the bus 15), and the request is
forwarded to the DRQ 52 or 62 in the opposite side via lines
72 or 73. This request is run on the opposite bus, under
control of the arbiter 53 or 63, and the completion status and
data are forwarded back to the DCQ 48 or 58 via bus 47 or
57. After status/data are placed in the allocated buffer in the
DCQ in this manner, this buffer is not valid until ordering
rules are satisfied; e.g., a write cannot be completed until
previous reads are completed. When a delayable request
"matches" a DCQ buffer and the requested data is valid, then
the request cycle is ready for immediate completion.
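
The check, allocate, forward and complete sequence described for a delayed completion queue may be easier to follow as a small software model. The following is only an illustrative sketch and not the bridge logic itself: the class and method names, the string status codes, the "retry" answer when no buffer is free, and the release of a buffer once its data has been delivered are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Request:
    """Fields compared by the DCQ: command, address and byte enables."""
    command: str
    address: int
    byte_enables: int


@dataclass
class Buffer:
    """One DCQ buffer: the request it serves plus completion data."""
    request: Request
    data: Optional[bytes] = None
    valid: bool = False


class DelayedCompletionQueue:
    """Hypothetical model of one DCQ (names are illustrative only)."""

    def __init__(self, num_buffers: int, forward: Callable[[Request], None]):
        self.num_buffers = num_buffers
        self.forward = forward          # pushes a request to the opposite DRQ
        self.buffers: List[Buffer] = []

    def submit(self, req: Request) -> Tuple[str, Optional[bytes]]:
        # Step 1: look for a buffer already allocated to this same request.
        for buf in self.buffers:
            if buf.request == req:
                if buf.valid:
                    # Match with valid data: ready for immediate completion.
                    # Releasing the buffer here is an assumption.
                    self.buffers.remove(buf)
                    return ("complete", buf.data)
                # Duplicate of a pending request: not forwarded again.
                return ("retry", None)
        # Step 2: no match, so allocate a buffer if one is available.
        if len(self.buffers) >= self.num_buffers:
            return ("retry", None)      # assumed behavior when full
        self.buffers.append(Buffer(req))
        # Step 3: forward the request to the DRQ on the opposite side.
        self.forward(req)
        return ("delayed", None)

    def fill(self, req: Request, data: bytes) -> None:
        """Store returned status/data; the buffer is not yet valid."""
        for buf in self.buffers:
            if buf.request == req:
                buf.data = data

    def mark_valid(self, req: Request) -> None:
        """Called once the ordering rules have been satisfied."""
        for buf in self.buffers:
            if buf.request == req and buf.data is not None:
                buf.valid = True
```

In this sketch a master that repeats a read sees "delayed" on the first attempt, "retry" while the data is pending or not yet marked valid, and "complete" with the data once the buffer is valid.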

The downstream DCQ 48 stores status/read data for
PCI-to-host delayed requests, and the upstream DCQ 58
stores status/read data for Host-to-PCI delayed or deferred
requests. Each DSDCQ buffer stores eight cache lines
(256-Bytes of data), and there are eight buffers (total data
storage=2-KByte). The four buffers in the upstream DCQ
58, on the other hand, each store only 32-Bytes of data, a
cache line (total data storage=128-Bytes). The upstream and
downstream operation is slightly different in this regard. The
bridge control circuitry causes prefetch of data into the
DSDCQ buffers 48 on behalf of the master, attempting to
stream data with zero wait states after the delayed request
completes. DSDCQ buffers are kept coherent with the host
bus 15 via snooping, which allows the buffers to be
discarded as seldom as possible. Requests going the other
direction are not subjected to prefetching, however, since
many PCI memory regions have "read side effects" (e.g.,
stacks and FIFO's), so the bridge never prefetches data into
these buffers on behalf of the master, and USDCQ buffers
are flushed as soon as their associated deferred reply
completes.

The posted write queues each contain a control block 50
or 60 and a dual port RAM memory 51 or 61, with each one
of the buffers in these RAMs storing command and data for
one write. Only memory writes are posted, i.e., writes to I/O
space are not posted. Because memory writes flow through
dedicated queues within the bridge, they cannot be blocked by
delayed requests that precede them; this is a requirement of
the PCI specification. Each of the four buffers in DSPWQ
50, 51 stores 32-Bytes of data plus commands for a
host-to-PCI write; this is a cache line; the bridge might receive
a cacheline-sized write if the system has a PCI video card
that supports the P6 USWC memory type. The four buffers
in the DSPWQ 50, 51 provide a total data storage of
128-Bytes. Each of the four buffers in USPWQ 60, 61 stores
256-Bytes of data plus commands for a PCI-to-host write;
this is eight cache lines (total data storage=1-KByte). Burst
memory writes that are longer than eight cache lines can
cascade continuously from one buffer to the next in the
USPWQ. Often, an entire page (e.g., 4-KB) is written from
disk to main memory in a virtual memory system that is
switching between tasks; for this reason, the bridge has more
capacity for bulk upstream memory writes than for
downstream.
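
The storage figures quoted for the completion and posted write queues follow from a 32-Byte cache line and the stated buffer counts. The short calculation below is a sketch that reproduces those totals; the table layout and the function names are assumptions introduced only for this illustration.

```python
CACHE_LINE_BYTES = 32  # one cache line, as stated for the example embodiment

# queue name -> (number of buffers, cache lines per buffer), from the text
QUEUES = {
    "DSDCQ": (8, 8),   # eight buffers of eight cache lines
    "USDCQ": (4, 1),   # four buffers of one cache line
    "DSPWQ": (4, 1),   # four buffers of one cache line
    "USPWQ": (4, 8),   # four buffers of eight cache lines
}


def buffer_bytes(name: str) -> int:
    """Data bytes held by a single buffer of the named queue."""
    _, lines = QUEUES[name]
    return lines * CACHE_LINE_BYTES


def total_bytes(name: str) -> int:
    """Total data storage of the named queue."""
    buffers, _ = QUEUES[name]
    return buffers * buffer_bytes(name)


def buffer_fills_for_burst(name: str, burst_bytes: int) -> int:
    """Buffer-sized pieces needed to carry a burst (ceiling division)."""
    size = buffer_bytes(name)
    return (burst_bytes + size - 1) // size


for name in QUEUES:
    print(name, buffer_bytes(name), total_bytes(name))
```

Under these figures a 4-KB page written upstream occupies sixteen 256-Byte buffer fills in the USPWQ, which is why such a burst has to cascade from one buffer to the next rather than fit in the queue at once.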

The arbiters 53 and 63 control event ordering in the 
QBLKs 45 and 46. These arbiters make certain that any 
transaction in the DRQ 52 or 62 is not attempted until posted 
writes that preceded it are flushed, and that no datum in a 
DCQ is marked valid until posted writes that arrived in the 
QBLK ahead of it are flushed. 
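
The first ordering rule enforced by the arbiters, that a delayed request is not attempted until the posted writes that preceded it are flushed, can be sketched with arrival sequence numbers. This is a minimal model under stated assumptions, not the arbiter circuit: the class name, the use of a shared counter for arrival order, and the choice to run an eligible request before later writes are all hypothetical.

```python
from collections import deque
from itertools import count


class QueueBlock:
    """Hypothetical model of one QBLK with a posted write queue and a DRQ."""

    def __init__(self):
        self._seq = count()              # arrival order shared by both queues
        self.posted_writes = deque()     # entries are (sequence, payload)
        self.delayed_requests = deque()  # entries are (sequence, payload)

    def post_write(self, payload):
        self.posted_writes.append((next(self._seq), payload))

    def add_delayed_request(self, payload):
        self.delayed_requests.append((next(self._seq), payload))

    def next_to_run(self):
        """Pick the next transaction for the opposite bus, or None if idle."""
        if self.delayed_requests:
            req_seq, req = self.delayed_requests[0]
            # The request may run only if no older posted write is pending.
            if not self.posted_writes or self.posted_writes[0][0] > req_seq:
                self.delayed_requests.popleft()
                return ("request", req)
        if self.posted_writes:
            _, write = self.posted_writes.popleft()
            return ("write", write)
        return None
```

With a write W0, then a read R1, then a write W2 arriving in that order, this model runs W0 first, then R1, then W2, so the read never passes the write that preceded it.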

Referring to FIG. 3a, the data and control signal protocol 
on the bus 15 is defined by the processors 11-14, which in 
the example are Intel "Pentium Pro" devices. The processors 
11-14 have a bus interface circuit within each chip which 
provides the bus arbitration and snoop functions for the bus 
15. A P6 bus cycle includes six phases: an arbitration phase, 
a request phase, an error phase, a snoop phase, a response 
phase, and a data phase. A simple read cycle where data is 
immediately available (i.e., a read from main memory 17) is 
illustrated in FIG. 3a. This read is initiated by first acquiring 
the bus; a bus request is asserted on the BREQn# line during 
T1; if no other processors having a higher priority (using a
rotating scheme) assert their BREQn#, a grant is assumed 
and an address strobe signal ADS# is asserted in T2 for one 
clock only. The address, byte enables and command signals 
are asserted on the A# lines, beginning at the same time as 
ADS#, and continuing during two cycles, T3 and T4, i.e., the 
asserted information is multiplexed onto the A# lines in two 
cycles. During the first of these, the address is applied, and 
during the second, the byte enables and the commands are 
applied. The error phase is a parity check on the address bits, 
and if a parity error is detected an AERR# signal is asserted 
during T5, and the transaction aborts. The snoop phase 
occurs during T7; if the address asserted during T3 matches 
the tag of any of the L2 cache lines and is modified, or any
other resource on bus 15 for which coherency is maintained,
a HITM# signal is asserted during T7, and a writeback must
be executed before the transaction proceeds. That is, if the 
processor 11 attempts to read a location in main memory 17 
which is cached and modified at that time in the L2 cache of 
processor 12, the read is not allowed to proceed until a 
writeback of the line from L2 of processor 12 to memory 17
is completed, so the read is delayed. Assuming that no parity 
error or snoop hit occurs, the transaction enters the response 
phase during T9. On lines RS[2:0]#, a response code is
asserted during T9; the response code indicates "normal
data," "retry," "deferred," etc., depending on when the data
is going to be available in response to the read request.
Assuming the data is immediately available, the response
code is "normal data" and the data itself is asserted on data
lines D[63:0]# during T9 and T12 (the data phase); usually
a read request to main memory is for a cache line, 128-bytes,
so the cache line data appears on the data lines during two
cycles, 64-bytes each cycle, as shown. The data bus busy
line DBSY# is sampled before data is asserted, and if free
then the responding agent asserts DBSY# itself during
T9-T11 to hold the bus, and asserts data ready on the
DRDY# line to indicate that valid data is being applied to the
data lines.

Several read requests can be pending on the bus 15 at the
same time. That is, another request can be asserted by any
agent which is granted the bus (the same processor, or by a
different processor), during T5, indicated by dotted lines for
the ADS# signal, and the same sequence of error, snoop,
response, and data phases would play out in the same order,
as discussed. Up to eight transactions can be pending on the
bus 15 at one time. The transactions complete in order unless
they are split with a deferred response. Transactions that
receive a deferred response may complete out of order.

A simple write transaction on the P6 bus 15 is illustrated
in FIG. 3b. As in a read transaction, after being granted the
bus, in T3 the initiator asserts ADS# and asserts the REQa0#
[...] in T6. TRDY# is active and DBSY# is inactive in T8, so
data transfer can begin in T9; DRDY# is asserted at this time.
The initiator drives data onto the data bus D[63:0]# during T9.

A burst or full-speed read transaction is illustrated in FIG.
3c. Back-to-back read data transfers from the same agent
with no wait states. Note that the request for transaction-4 is
being driven onto the bus while data for transaction-1 is just
completing in T10, illustrating the overlapping of several
transactions. DBSY# is asserted for transaction-1 in T7 and
remains asserted until T10. Snoop results indicate no
implicit writeback data transfers so TRDY# is not asserted.

Likewise, a burst or full-speed write transaction with no
wait states and no implicit writebacks is illustrated in FIG.
3d. TRDY# for transaction-2 can be driven the cycle after
RS[2:0]# is driven. In T11, the target samples TRDY# active
and DBSY# inactive and accepts data transfer starting in
T12. Because the snoop results for transaction-2 have been
observed in T9, the target is free to drive the response in T12.

A deferred read transaction is illustrated in FIG. 3e. This
is a split transaction, meaning the request is put out on the
bus, then at some time later the target initiates a reply to
complete the transaction, while other transactions occur on
the bus in the intervening time. Agents use the deferred
response mechanism of the P6 bus when an operation has
significantly greater latency than the normal in-order
response. During the request phase on the P6 bus 15, an
agent can assert Defer Enable DEN# to indicate if the
transaction can be given a deferred response. If DEN# is
inactive, the transaction cannot receive a deferred response;
some transactions must always be issued with DEN#
inactive, e.g., bus-locked transactions, deferred replies,
writebacks. When DEN# is inactive, the transaction may be
completed in-order or it may be retried, but it cannot be
deferred. A deferred transaction is signalled by asserting
DEFER# during the snoop phase followed by a deferred
response in the response phase. On a deferred response, the
response agent must latch the deferred ID, DID[7:0]#, issued
during the request phase, and after the response agent
completes the original request, it must issue a matching
deferred-reply bus transaction, using the deferred ID as the
address in the reply transaction's request phase. The
deferred ID is eight bits transferred on pins Ab[23:16] in the
second clock of the original transaction's request phase.

A read transaction on the PCI bus 20 (or 21) is illustrated
in FIG. 3f. It is assumed that the bus master has already
arbitrated for and been granted access to the bus. The bus
master must then wait for the bus to become idle, which is
done by sampling FRAME# and IRDY# on the rising edge
of each clock (along with GNT#); when both are sampled
deasserted, the bus is idle and a transaction can be initiated
by the bus master. At start of clock T1, the initiator asserts
FRAME#, indicating that the transaction has begun and that
a valid start address and command are on the bus. FRAME#
must remain asserted until the initiator is ready to complete
the last data phase. When the initiator asserts FRAME#, it
also drives the start address onto the AD bus and the
transaction type onto the Command/Byte Enable lines, C/BE.
A turn-around cycle (i.e., a dead cycle) is required on
all signals that may be driven by more than one bus
agent, to avoid collisions. At the start of clock T2, the
initiator ceases driving the AD bus, allowing the target to
take control of the AD bus to drive the first requested data
item back to the initiator. Also at the start of clock T2, the
initiator ceases to drive the command onto the C/BE lines
and uses them to indicate the bytes to be transferred in the
currently addressed doubleword (typically, all bytes are
asserted during a read). The initiator also asserts IRDY#
during T2 to indicate it is ready to receive the first data item
from the target. The initiator asserts IRDY# and deasserts
FRAME# to indicate that it is ready to complete the last data
phase (T5 in FIG. 3f). During clock T3, the target asserts
DEVSEL# to indicate that it recognized its address and will
participate in the transaction, and begins to drive the first
data item onto the AD bus while it asserts TRDY# to
indicate the presence of the requested data. When the
initiator sees TRDY# asserted in T3 it reads the first data
item from the bus. The initiator keeps IRDY# asserted upon
entry into the second data phase in T4, and does not deassert
FRAME#, indicating it is ready to accept the second data
item. In a multiple data phase transaction (e.g., a burst), the
target latches the start address into an address counter, and
increments this address to generate the subsequent
addresses.

Referring now to FIG. 4, the processor bus 15 is connected
through the processor bus interface 43, the upstream
queue block 45, the downstream queue block 46 and the PCI
interface 44 to the PCI bus 20. The processor bus interface
43 includes a processor bus initiator 60 and a processor bus
target 62. The target 62 is capable of two-way communications
with the upstream queue block 45. Similarly, the PCI
initiator 66 communicates with the downstream queue block
46. The PCI initiator 66 and PCI target 64 are part of the PCI
interface 44.

Referring to FIG. 5, the PCI initiator 66 and processor
target 62 are shown to better explain the relationship with
certain transactions on the PCI bus 20. Also depicted is a
configuration module 68 which may be implemented as part
of the PCI bus interface 44. It may include configuration,
diagnostic and/or memory mapped registers.

The PCI initiator 66 may initiate a transaction which
originally was run on the processor bus 15 and which is
transferred to the PCI initiator 66 from the processor bus
target 62. As indicated in FIG. 5, the processor bus target 62
may receive a request for a transaction and ultimately
provide a response to the processor bus 15. The processor
target 62 sends the request through the queue block 45 to the 
PCI initiator 66. The PCI initiator 66 then runs the transac- 
tion on the line 70. The transaction on the line 70 passes 
through a multiplexor 72 to a bi-directional buffer including
the buffers 74 and 76 and out to the PCI bus 20. The same 
transaction may be bidirectionally routed through the buffer
76 to a second multiplexor 78.

In certain instances, a new transaction from the initiator 
66 would be passed directly through the second multiplexor
78 to the configuration module 68 via the path 79. One 
instance where this would occur would be when the con- 
figuration module 68 was the ultimate target of the transac- 
tion being run by the PCI initiator 66. The transaction may 
then also be run on the PCI bus 20.

In regular transactions data may be returned through the 
line 81 to the line 83. It may then be blocked by the 
multiplexor 78 which is switched to only accept inputs from 
the PCI initiator 66. 

The multiplexor 72 is controlled by a signal on the line 84.
The multiplexor 72 may be switched by a signal on the line 
84 to allow either the PCI initiator 66 or the configuration 
module 68 via the line 86, to control the PCI bus 20. 

Similarly, the multiplexor 78 is controlled by a signal 
issued from the PCI initiator 66 over the line 88. When
desired, the multiplexor 78 may be operable to reject a 
bidirectional signal from the buffer 76 and to simply pass the 
original PCI initiator transaction from the line 70 directly to 
the configuration module 68. In this way, a transaction
initiated from the PCI initiator 66 may be run on the PCI bus 
20. However, the transaction is also directed straight to the 
configuration module 68 when it is the intended target. 

The configuration module 68 may then respond with data 
over the lines 86 and 98 to the P6 target 62. Ultimately, this
information may get back to the P6 bus 15. In some 
instances, the data may also be provided, via line 86, to the 
PCI bus 20 through the multiplexor 72. 
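
The routing described above can be modeled in a few lines. The following is a minimal sketch and not the bridge's actual logic: the class names `ConfigModule` and `PciBus`, the `hung` flag, the register contents, and the function `bridge_config_read` are all hypothetical, chosen only to illustrate that a direct cycle still appears on the bus (with the ID select masked) while the data comes back from the module over the internal path.

```python
from typing import Optional


class ConfigModule:
    """Stand-in for configuration module 68 (register contents are made up)."""

    def __init__(self) -> None:
        self.registers = {0x00: 0xCAFE}

    def read(self, offset: int) -> int:
        return self.registers[offset]


class PciBus:
    """Stand-in for PCI bus 20; `hung` models a bus that no longer completes cycles."""

    def __init__(self, module: ConfigModule) -> None:
        self.module = module
        self.hung = False
        self.observed = []  # cycles that appeared on the bus (bus visibility)

    def run(self, offset: int, idsel_masked: bool) -> Optional[int]:
        self.observed.append(offset)
        if self.hung or idsel_masked:
            return None  # no target claims the cycle
        return self.module.read(offset)


def bridge_config_read(bus: PciBus, module: ConfigModule, offset: int,
                       direct: bool) -> Optional[int]:
    """Run a configuration read either over the bus or over the direct path."""
    if direct:
        # The cycle is still driven on the bus, but with ID select masked,
        # while multiplexor 78 hands the request straight to the module.
        bus.run(offset, idsel_masked=True)
        return module.read(offset)  # modeled return over lines 86 and 98
    return bus.run(offset, idsel_masked=False)
```

In this model a read over a hung bus returns nothing, while the same read with `direct=True` still returns the register value and still leaves a record in `bus.observed`.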

When the PCI bus is in a hang or other error condition, 
there may be critical information stored in the module 68
which could not be accessed via the PCI bus 20. In this case, 
direct access to the configuration module 68, without using 
the PCI bus 20, allows critical information to be obtained. 
The information stored in the module 68 could include a 
listing of recent transactions including the initiator, the target
and the type of command that was involved. This informa- 
tion is useful in determining the cause of the hang condition 
on the PCI bus 20. It may be utilized to attempt to diagnose 
the problem and in some cases to even correct the problem 
without requiring a system shut down.

By running the transaction on the PCI bus at the same 
time the transaction is directly shunted to the module 68, 
control over the bus 20 may be maintained. As a result, 
additional transactions will not occur which would simultaneously
target the module 68. Moreover, bus visibility is
achieved which may be useful in various operations, includ- 
ing debugging. 

Alternatively, transactions from the PCI initiator 66 could 
be run both on the bus 20 and directly through the module 
68 with the return path controlled by the module 68. For
example, the module 68 could switch the multiplexor 96 (by
a path not shown) when the module 68 claims the cycle. 

When the transaction, which is actually being shunted 
directly to the configuration module 68, is run on the PCI bus 
by the PCI initiator 66, it may be necessary to mask the ID
select signal which would identify the particular target
device. This ID select signal would correspond to a multiplexed
address in normal PCI terminology. On the PCI bus
20 during configuration cycles, the initiator asserts one 
address line that corresponds to the target of the configura- 
tion cycle. The signal identifies the target device and there- 
fore initiates a response by the target device. Since it would 
be undesirable for any target to respond (since the configu- 
ration module 68 is being addressed directly), this signal 
from the initiator 66 is masked by the logic gate 94. When 
the logic gate 94 receives a signal on the line 90 indicating 
that a direct cycle to the configuration module 68 is being 
run, the master ID select signal on the line 92 is blocked. 
This makes all other devices ignore the configuration access 
that was run on the bus 20. Thus, no target responds to the 
PCI bus transaction. The arbiters 53, 63 keep the transac- 
tions in sync with one another. During normal transactions, 
the master ID select signal would issue on the bus 20. 
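
The masking step can be illustrated with a small sketch. It is only a model under stated assumptions: the mapping of device number n to address line AD[16 + n] (`IDSEL_BASE`) is an illustrative wiring choice not given in the text, and the function names `master_idsel`, `gate_94`, and `responding_devices` are hypothetical.

```python
IDSEL_BASE = 16  # assumption: device n is selected by address line AD[16 + n]


def master_idsel(device: int) -> int:
    """One-hot AD pattern the initiator would drive for a configuration cycle."""
    return 1 << (IDSEL_BASE + device)


def gate_94(idsel: int, direct_cycle: bool) -> int:
    """Block the master ID select (line 92) when line 90 signals a direct cycle."""
    return 0 if direct_cycle else idsel


def responding_devices(ad_lines: int, devices: range) -> list:
    """Devices whose ID select line is asserted on the bus."""
    return [d for d in devices if ad_lines & master_idsel(d)]
```

With the gate open, exactly one device sees its select line; with a direct cycle signalled, the pattern on the bus is all zeros and no device claims the configuration access.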

The bidirectional signal from the buffer 76 may, under 
certain circumstances, be passed by the multiplexor 96 to the 
processor target device 62. Control over the switching 
operation of the multiplexor 96 may also be obtained via the 
line 90. Similarly, data and control signals outputted from 
the configuration module 68 may be shunted directly to 
upstream queue block 45 and the processor target 62 via the 
line 98 when the multiplexor 96 is in the appropriate 
configuration. The signal on the line 90 used to control
the ID select signal also controls the multiplexor 78 and the 
multiplexor 96. 
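
Because a single signal on the line 90 sets the logic gate 94 and both multiplexors, the datapath has in effect two states. The sketch below is one reading of the description, not a statement of the actual circuit: the type `DatapathState`, the function `apply_line_90`, and the string labels for the selected inputs are hypothetical.

```python
from typing import NamedTuple


class DatapathState(NamedTuple):
    idsel_blocked: bool   # logic gate 94
    mux78_source: str     # input passed toward configuration module 68
    mux96_source: str     # input passed toward processor target 62


def apply_line_90(direct_cycle: bool) -> DatapathState:
    """Derive all three settings from the single control signal on line 90."""
    if direct_cycle:
        # ID select masked; the module is fed from line 70 and answers on line 98.
        return DatapathState(True, "line 70", "line 98")
    # Normal cycle: ID select issues on the bus; traffic passes through buffer 76.
    return DatapathState(False, "buffer 76", "buffer 76")
```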

The configuration module 68 and bridge 18 or 19 may be 
implemented on one semiconductor die. Alternatively, they 
may be separate, integrated circuits. 

The present invention may enable the diagnosis and repair 
of bus faults. For example, the configuration module 68 
could include a FIFO buffer which stores information about 
transactions that have occurred previously. For example, the 
buffer may be a given number of spaces deep and that given 
number of transactions are stored such that the last several 
transactions are stored in a shorthand format in the buffer. If 
the bus hangs, information about the last several transactions 
can be analyzed. Generally, the failure condition would be 
detected by a watchdog timer time-out indicating that no
valid data transfer happened on the bus for a predetermined
amount of time (e.g., 2^18 clock cycles). The transaction
could then be terminated by asserting STOP# followed by 
target abort, taking the device off of the bus. A reset could 
be utilized to see if the bus hang condition had been 
remedied. If not, an analysis could be made using the stored 
transaction information in the buffer to determine what was 
the last device that was involved before the problem arose. 
The faulty device could then be electronically disconnected 
from the system and a message could be provided indicating 
that the faulty device should be replaced. Since the device 
has now been disconnected it would be possible to continue 
operation of the bus. A system for implementing such a bus 
watching functionality is described in a copending U.S. 
patent application entitled "Fault Isolation," Ser. No. 08/658,750,
filed Jun. 5, 1996, in the name of Alan L. Goodrum et
al, hereby expressly incorporated by reference herein. 
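
The buffer and watchdog described above might be sketched as follows. This is an illustrative model only: the class `BusWatcher`, its default depth of 8, and the (initiator, target, command) tuple layout are hypothetical, and the limit reads the example time-out in the text as 2^18 clock cycles, which is an assumption.

```python
from collections import deque

WATCHDOG_LIMIT = 2 ** 18  # assumed reading of the example time-out, in clocks


class BusWatcher:
    """Fixed-depth log of recent transactions plus an idle-clock watchdog."""

    def __init__(self, depth: int = 8) -> None:
        self.log = deque(maxlen=depth)  # oldest entries fall out automatically
        self.idle_clocks = 0

    def record(self, initiator: str, target: str, command: str) -> None:
        """Store a completed transfer in shorthand form and restart the watchdog."""
        self.log.append((initiator, target, command))
        self.idle_clocks = 0

    def tick(self, clocks: int = 1) -> bool:
        """Advance the clock; return True once the watchdog has timed out."""
        self.idle_clocks += clocks
        return self.idle_clocks >= WATCHDOG_LIMIT

    def last_target(self):
        """Target of the most recent logged transaction, or None if the log is empty."""
        return self.log[-1][1] if self.log else None
```

After a time-out, `last_target()` gives a first candidate for the device to isolate, which is why the log has to stay readable over a path that does not depend on the hung bus.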

If the stored information about the transactions were 
inaccessible because of the bus hang condition, there would 
be no benefit from storing the transactions. Thus, it is 
advantageous to have a system for enabling such a buffer to 
be accessed through an alternative route when a bus fault 
occurs. Those skilled in the art will appreciate a number of 
other circumstances where it is desirable to be able to access 
critical information by a separate path not dependent on the 
active status of any particular bus. 

The use of the internal direct path also may eliminate the 
worst case reflections. Each device on the bus 20, 21 
downstream from a bridge 18, 19, shown in FIG. 1, receives
a signal and a reflected signal from the end of the bus 20, 21 
farthest away from the bridge. The delay from the bridge 
back to the bridge is the longest delay. Thus, the signal 
quality is poorest for signals from the bridge 18, 19 back to
the same bridge 18, 19. The internal direct path may 
eliminate these worst case reflections. For this purpose, it is 
advantageous to use the internal direct path for configuration 
and memory transactions. 

While the present invention has been described with
respect to the single preferred embodiment, those skilled in 
the art will appreciate a number of modifications and varia- 
tions therefrom. It is intended that the appended claims 
cover all such modifications and variations as fall within the 
true spirit and scope of the present invention.

APPENDIX A — P6 Bus Signals 

ADS# — Address Strobe, asserted to define the beginning
of the Request Phase. The REQa[4:0]# and Aa[35:3]#
signals are valid in the clock that ADS# is asserted (the
"a" clock). The REQb[4:0]# and Ab[35:3]# signals are
valid in the next clock after ADS# was asserted (the
"b" clock).

A[35:3]# — Address signals, conveying information during
both clocks of the two-clock request phase. Aa[35:3]#
are signals during first clock and Ab[35:3]# are signals 
during second clock. Aa[35:3]# convey address, and 
Ab[35:3]# convey cycle-type, byte enable, deferred ID, 
etc.

D[63:0]# — System Data signals — carry the data for a 
transaction during the data phase. 

REQ[4:0]# — Request command signals, asserted during 
both clocks of the request phase, indicating type of 
transaction being requested and info about that transaction.

RS[2:0]# — Response status signals, driven by the target 
during the response phase, indicate how current transaction
will be processed. Valid responses include: Normal
with or without data; Idle; Retry; Defer; Implicit
Writeback. 

DBSY# — Data bus busy signal, asserted by the agent 
driving the data on D[63:0]# to indicate a multi-clock
data phase. Asserted on first clock that data may be
driven, deasserted when the data bus is no longer 
needed. 

DEFER# — Defer signal, used by target to indicate to the 
agent issuing the transaction that it may not be completed
in order. An active DEFER# means that the
associated transaction will complete with a DEFER 
REPLY or a RETRY response. 

DRDY# — Data ready signal, driven in same clock as the
D[63:0]# signals and indicates that the data is valid and
may be sampled.

TRDY# — Target ready signal, driven by the target for 
write transactions to indicate that target is ready to 
accept the current data for a write or writeback. 

HIT# — Cache hit signal for snooping, along with HITM#
determine results of snoop phase. HITM# is the cache 
hit to modified signal. 

AERR# — Address parity error, driven during error phase. 

GNTn# — Arbiter grant signal to master, indicating initiator
is granted the bus.

LOCK# — Bus lock signal, asserted from the request 
phase of the first transaction through the response phase
of the final transaction. No other bus masters may issue
transactions during a bus lock. Locked cycle may be 
stopped on the first transaction if DEFER# is asserted, 
or by error signals. 

APPENDIX B— PCI Bus Signals 

AD[31:0] — Address/Data (with AD[63:32] for 64-bit
bus) — conveys the address for a read or write request,
then used to transfer data.

C/BE#[3:0] — Command/Byte Enable (with C/BE#[7:4]
for 64-bit bus) — conveys bus commands in first phase 
then byte enables in later phases. 

FRAME# — Asserted by master to begin a transaction. 
Held in asserted state until transaction is completed. 

TRDY# — Target Ready — indicates that target is ready to 
receive or transfer data. 

IRDY# — Initiator Ready — indicates that master or initia- 
tor of transaction is ready to send or receive data. 

DEVSEL# — Device Select — indicates driving device has 
decoded its address as the target of the current access. 
As an input, it indicates whether any device on the bus 
has been selected. 

STOP# — Target is requesting the master to stop the 
current bus transaction. Aborts, 

REQ# — Request — arbitration signal asserted by an ini- 
tiator when requesting the bus. 

GNT# — Grant — signal from arbiter to agent in response 
to REQ#, indicating that bus has been granted to 
agent — one of six signals with one going to each 
device. 

LOCK# — Atomic operation, may require multiple trans- 
actions to complete, asserted when transaction must be 
completed before any other transaction can be initiated. 
Only supported going downstream. 

What is claimed is: 

1. A computer system comprising: 
a processor; 

a hard disk drive coupled to said processor; 

a bridge coupled to first and second buses, the second bus 
coupled to said processor; 

an alternate path for communicating transactions; 

a target in said bridge coupled to said first bus and said 
alternate path; and 

an initiator coupled to said first bus and said alternate 
path, said initiator causing transactions to be simulta- 
neously driven over said first bus and over the alternate 
path to said target in the bridges, said initiator causing 
all transactions directed to said target in the bridge to 
run through said alternate path and causing addressing 
information of said transactions to be masked on said 
first bus. 

2. The system of claim 1 wherein said first bus is a PCI 
bus. 

3. The system of claim 1 wherein said bridge is formed on 
a semiconductor die and said target in said bridge is situated 
on the same die as said bridge. 

4. The system of claim 1 wherein said bridge target 
comprises a configuration module including a store. 

5. The system of claim 4 wherein said configuration 
module stores information which is useful in diagnosing an 
error condition in said first bus. 

6. The system of claim 1 including a device for enabling 
information to be obtained from said bridge target when said 
first bus is not working correctly. 
7. A method of processing transactions between an initiator
and a target on the same bus comprising the steps of:

initiating a transaction from said initiator intended for said 
target; 

receiving said transaction in a bridge;

communicating said transaction over said bus and simul- 
taneously to said target in the bridge over an alternate 
path; and 

causing all transactions directed to said target in the
bridge to be run directly over said alternate path 
without using said bus while masking addressing infor- 
mation of said transaction on said bus. 

8. The method of claim 7 including the step of issuing the 
transaction from the bridge to the bus.

9. The method of claim 7 including the step of obtaining 
information useful in diagnosing bus faults by accessing 
information stored in a register in said target in the bridge. 

10. A method of processing transactions between an 
initiator and a target on the same bus comprising the steps 
of: 
initiating a transaction from said initiator directed to said
target on said bus;

receiving said transaction in a bridge which also includes
a target;

issuing the transaction from the bridge to said target on 
said bus and simultaneously driving the transaction to 
said target in said bridge over an alternate path that 
does not include said bus; and 

masking addressing information of a said transaction on 
said bus when said target in the bridge is addressed 
directly as an intended target. 

11. The method of claim 10 including the step of obtaining
information useful in diagnosing system faults by accessing 
information stored in a configuration module register in said 
target in said bridge. 

12. The method of claim 11 wherein the masking step 
includes the step of masking an ID select signal for the target 
on said bus. 